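Extracting temporal relations between events in text is a crucial yet challenging problem for natural language understanding. Depending on the distance between the events, a model must learn to weigh information from the local and global contexts surrounding the event pair differently when predicting the temporal relation. Learning how to fuse this information has proved challenging for transformer-based language models. We therefore introduce MulCo: Multi-Scale Contrastive Co-Training, a technique for better fusing local and global contextualized features. Our model uses a BERT-based language model to encode local context and a graph neural network (GNN) to represent global document-level syntactic and temporal features. Unlike previous state-of-the-art methods, which use simple concatenation over multi-view features or select optimal sentences with sophisticated reinforcement learning approaches, our model co-trains the GNN and BERT modules using a multi-scale contrastive learning objective. The GNN and BERT modules learn a synergistic parameterization by contrasting GNN multi-layer, multi-hop subgraphs (i.e., global context embeddings) with BERT outputs (i.e., local context embeddings). We demonstrate empirically that MulCo fuses the local and global contexts encoded by BERT and the GNN better than the current state of the art. Our experimental results show that MulCo achieves new state-of-the-art results on several temporal relation extraction datasets.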
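To make the idea of contrasting local and global embeddings concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss in NumPy. It is not the MulCo objective itself, which the abstract does not specify in detail. The function name `info_nce`, the `temperature` value, and the assumption that row i of each matrix describes the same event pair are all illustrative choices.

```python
import numpy as np

def info_nce(local_emb, global_emb, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's exact objective).

    Row i of local_emb and row i of global_emb are assumed to be two views
    of the same event pair (the positive); all other rows act as negatives.
    """
    # L2-normalise so that dot products are cosine similarities.
    a = local_emb / np.linalg.norm(local_emb, axis=1, keepdims=True)
    b = global_emb / np.linalg.norm(global_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal.
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
aligned = info_nce(x, x)           # matching views
mismatched = info_nce(x, x[::-1])  # every positive replaced by a wrong row
```

In this toy setup the loss is low when the two views agree and high when they are mismatched, which is the signal a co-training scheme of this kind would rely on.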
translated by Google Translate
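Natural language understanding (NLU) has made massive progress driven by large benchmarks, paired with research on transfer learning to broaden its impact. Benchmarks are dominated by a small set of frequent phenomena, leaving a long tail of infrequent phenomena. In this work, we reflect on the question: have transfer learning methods sufficiently addressed the performance of benchmark-trained models on the long tail? Since benchmarks do not list the phenomena they include or exclude, we conceptualize the long tail using macro-level dimensions such as underrepresented genres, topics, etc. We assess trends in transfer learning research through a qualitative meta-analysis of 100 representative papers on transfer learning for NLU. Our analysis asks three questions: (i) Which long-tail dimensions do transfer learning studies target? (ii) Which properties help adaptation methods improve performance on the long tail? (iii) Which methodological gaps have the greatest negative impact on long-tail performance? Our answers to these questions highlight major avenues for future research on transfer learning for the long tail. Finally, we present a case study comparing the performance of various adaptation methods on clinical narratives, to show how systematically conducted meta-experiments can provide insights that enable progress along these avenues.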
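Extracting information from conversational data is particularly challenging because the task-centric nature of conversation lets humans communicate implicit information effectively, in ways that are hard for machines. The challenges may differ between utterances depending on the role of the speaker within the conversation, especially when relevant expertise is distributed asymmetrically across roles. The challenges may also increase over the course of the conversation, as implicitly communicated information builds up more shared context. In this paper, we propose a novel modeling approach, MedFilter, which addresses these insights in order to improve performance at identifying and categorizing task-relevant utterances, and in doing so positively impacts performance on a downstream information extraction task. We evaluate this approach on a corpus of nearly 7,000 doctor-patient conversations, where MedFilter is used to identify medically relevant contributions to the discussion (achieving a 10% improvement over SOTA baselines in terms of area under the PR curve). Identifying task-relevant utterances benefits downstream medical processing, yielding improvements of 15%, 105%, and 23% for the extraction of symptoms, medications, and complaints, respectively.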
Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.
Many real-world applications of language models (LMs), such as code autocomplete and writing assistance, involve human-LM interaction, but the main LM benchmarks are non-interactive, where a system produces output without human intervention. To evaluate human-LM interaction, we develop a framework, Human-AI Language-based Interaction Evaluation (H-LINE), that expands non-interactive evaluation along three dimensions, capturing (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality. We then design five tasks ranging from goal-oriented to open-ended to capture different forms of interaction. On four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21's J1-Jumbo), we find that better non-interactive performance does not always result in better human-LM interaction and that first-person and third-party metrics can diverge, suggesting the importance of examining the nuances of human-LM interaction.
What is a rose, visually? A rose comprises its intrinsics, including the distribution of geometry, texture, and material specific to its object category. With knowledge of these intrinsic properties, we may render roses of different sizes and shapes, in different poses, and under different lighting conditions. In this work, we build a generative model that learns to capture such object intrinsics from a single image, such as a photo of a bouquet. Such an image includes multiple instances of an object type. These instances all share the same intrinsics, but appear different due to a combination of variance within these intrinsics and differences in extrinsic factors, such as pose and illumination. Experiments show that our model successfully learns object intrinsics (distribution of geometry, texture, and material) for a wide range of objects, each from a single Internet image. Our method achieves superior results on multiple downstream tasks, including intrinsic image decomposition, shape and image generation, view synthesis, and relighting.
Training embodied agents in simulation has become mainstream for the embodied AI community. However, these agents often struggle when deployed in the physical world due to their inability to generalize to real-world environments. In this paper, we present Phone2Proc, a method that uses a 10-minute phone scan and conditional procedural generation to create a distribution of training scenes that are semantically similar to the target environment. The generated scenes are conditioned on the wall layout and arrangement of large objects from the scan, while also sampling lighting, clutter, surface textures, and instances of smaller objects with randomized placement and materials. Leveraging just a simple RGB camera, training with Phone2Proc shows massive improvements from 34.7% to 70.7% success rate in sim-to-real ObjectNav performance across a test suite of over 200 trials in diverse real-world environments, including homes, offices, and RoboTHOR. Furthermore, Phone2Proc's diverse distribution of generated scenes makes agents remarkably robust to changes in the real world, such as human movement, object rearrangement, lighting changes, or clutter.
Accurate uncertainty measurement is a key step to building robust and reliable machine learning systems. Conformal prediction is a distribution-free uncertainty quantification algorithm popular for its ease of implementation, statistical coverage guarantees, and versatility for underlying forecasters. However, existing conformal prediction algorithms for time series are limited to single-step prediction without considering the temporal dependency. In this paper we propose a Copula Conformal Prediction algorithm for multivariate, multi-step Time Series forecasting, CopulaCPTS. On several synthetic and real-world multivariate time series datasets, we show that CopulaCPTS produces more calibrated and sharp confidence intervals for multi-step prediction tasks than existing techniques.
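For readers unfamiliar with conformal prediction, the following is a minimal sketch of standard split conformal prediction for a single-step, univariate forecast, i.e. the baseline setting the abstract says existing methods are limited to. It is not CopulaCPTS. The function name and the choice of absolute residuals as the nonconformity score are assumptions made for illustration.

```python
import math
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal intervals with absolute-residual scores (illustrative)."""
    # Nonconformity scores on a held-out calibration set.
    scores = np.sort(np.abs(np.asarray(cal_true, dtype=float)
                            - np.asarray(cal_pred, dtype=float)))
    n = scores.size
    # Rank of the finite-sample corrected (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the interval is unbounded.
    q = scores[k - 1] if k <= n else np.inf
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q

# Toy calibration set: predictions are all 0, true values are 1..9.
lower, upper = split_conformal_interval(
    cal_pred=np.zeros(9),
    cal_true=np.arange(1, 10),
    test_pred=[10.0],
    alpha=0.5,
)
```

Under exchangeability of calibration and test points, intervals built this way cover the true value with probability at least 1 - alpha. Extending that guarantee jointly across several forecast steps is the multi-step problem the abstract addresses.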
We apply Physics Informed Neural Networks (PINNs) to the problem of wildfire fire-front modelling. The PINN is an approach that integrates a differential equation into the optimisation loss function of a neural network to guide the neural network to learn the physics of a problem. We apply the PINN to the level-set equation, which is a Hamilton-Jacobi partial differential equation that models a fire-front with the zero-level set. This results in a PINN that simulates a fire-front as it propagates through a spatio-temporal domain. We demonstrate the agility of the PINN to learn physical properties of a fire under extreme changes in external conditions (such as wind) and show that this approach encourages continuity of the PINN's solution across time. Furthermore, we demonstrate how data assimilation and uncertainty quantification can be incorporated into the PINN in the wildfire context. This is a significant contribution to wildfire modelling, as the level-set method, which is a standard solver for the level-set equation, does not naturally provide this capability.
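As a rough illustration of how a differential equation enters a PINN loss, the block below writes the level-set equation in its common form and a typical two-term training loss. The abstract does not give the exact formulation, so the notation is assumed: the front speed s, the network output φ_θ, the initial level-set function φ_0, the collocation counts N_r and N_0, and the weight λ.

```latex
% Level-set equation: the fire-front is the zero-level set of \phi.
\[
  \frac{\partial \phi}{\partial t} + s(\mathbf{x}, t)\,\lVert \nabla \phi \rVert = 0
\]
% A typical PINN loss: PDE residual at collocation points plus an
% initial-condition term (weights and sampling are assumptions).
\[
  \mathcal{L}(\theta)
  = \frac{1}{N_r} \sum_{i=1}^{N_r}
    \Bigl( \partial_t \phi_\theta(\mathbf{x}_i, t_i)
         + s(\mathbf{x}_i, t_i)\,\lVert \nabla \phi_\theta(\mathbf{x}_i, t_i) \rVert \Bigr)^{2}
  + \frac{\lambda}{N_0} \sum_{j=1}^{N_0}
    \bigl( \phi_\theta(\mathbf{x}_j, 0) - \phi_0(\mathbf{x}_j) \bigr)^{2}
\]
```

Observational data can be assimilated in a formulation of this kind by adding a further squared-error term at the observed points.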
Electronic health records (EHR) offer unprecedented opportunities for in-depth clinical phenotyping and prediction of clinical outcomes. Combining multiple data sources is crucial to generate a complete picture of disease prevalence, incidence and trajectories. The standard approach to combining clinical data involves collating clinical terms across different terminology systems using curated maps, which are often inaccurate and/or incomplete. Here, we propose sEHR-CE, a novel framework based on transformers to enable integrated phenotyping and analyses of heterogeneous clinical datasets without relying on these mappings. We unify clinical terminologies using textual descriptors of concepts, and represent individuals' EHR as sections of text. We then fine-tune pre-trained language models to predict disease phenotypes more accurately than non-text and single terminology approaches. We validate our approach using primary and secondary care data from the UK Biobank, a large-scale research study. Finally, we illustrate in a type 2 diabetes use case how sEHR-CE identifies individuals without diagnosis that share clinical characteristics with patients.